
Raw Sequence Listing Error Summary

ERROR DETECTED    SUGGESTED CORRECTION    SERIAL NUMBER: [illegible]

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

1 Wrapped Nucleics
The number/text at the end of each line "wrapped" down to the next line.
This may occur if your file was retrieved in a word processor after creating it.
Please adjust your right margin to .3, as this will prevent "wrapping".

2 Wrapped Aminos
The amino acid number/text at the end of each line "wrapped" down to the next line.
This may occur if your file was retrieved in a word processor after creating it.
Please adjust your right margin to .3, as this will prevent "wrapping".

3 Incorrect Line Length
The rules require that a line not exceed 72 characters in length. This includes spaces.

4 Misaligned Amino Acid Numbering
The numbering under each 5th amino acid is misaligned. This may be caused by the use of tabs
between the numbering. It is recommended to delete any tabs and use spacing between the numbers.

5 Non-ASCII
This file was not saved in ASCII (DOS) text, as required by the Sequence Rules.
Please ensure your subsequent submission is saved in ASCII text so that it can be processed.

6 Variable Length
Sequence(s) __ contain n's or Xaa's which represented more than one residue.
As per the rules, each n or Xaa can only represent a single residue.
Please present the maximum number of each residue having variable length and
indicate in the (ix) feature section that some may be missing.

7 PatentIn ver. 2.0 "bug"
A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
[illegible] automatically generate this section from the [illegible]
sections for Artificial or Unknown sequences.

8 Skipped Sequences (OLD RULES)
Sequence(s) __ missing. If intentional, please use the following format for each skipped sequence:

[illegible]
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:X:
This sequence is intentionally skipped

Please also adjust the "(iii) NUMBER OF SEQUENCES:" response to include the skipped sequence(s).

9 Skipped Sequences (NEW RULES)
Sequence(s) __ missing. If intentional, please use the following format for each skipped sequence.

<210> sequence id number
<400> sequence id number
000

10 Use of n's or Xaa's (NEW RULES)
Use of n's and/or Xaa's have been detected in the Sequence Listing.

11 Use of <213>Organism (NEW RULES)
Sequence(s) __ are missing this mandatory field or its response.

12 Use of <220>Feature (NEW RULES)
Sequence(s) __ are missing the <220>Feature and associated headings.
Use of <220> to <223> is MANDATORY if <213>ORGANISM is "Artificial" or "Unknown".
(Sec. 1.823 of new Rules)

13
Please use "File Manager" or any other means to copy file to floppy disk.

AKS-Biotechnology Systems Branch- 5/15/99 
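
Several of the formatting errors listed in the summary above (line length over 72 characters, tabs used for alignment, non-ASCII text) can be checked mechanically before a diskette is resubmitted. The following is a minimal sketch under stated assumptions, not the PTO's checker: the function name `check_listing_lines` and the wording of the messages are hypothetical, and only the 72-character limit, the tab advice, and the ASCII requirement come from the summary itself.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the "Incorrect Line Length" entry


def check_listing_lines(text):
    """Return (line_number, problem) pairs for three rules named in the summary."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Spaces count toward the limit, as the summary states.
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line longer than 72 characters"))
        # Tabs are the suggested cause of misaligned numbering.
        if "\t" in line:
            problems.append((number, "tab character (use spaces)"))
        # The file must be plain ASCII text.
        if not line.isascii():
            problems.append((number, "non-ASCII character"))
    return problems
```

A listing file read as text can be passed straight to `check_listing_lines`; an empty result only means these three checks passed, not that the listing complies with the rules as a whole.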
RAW SEQUENCE LISTING DATE: 01/30/2001
PATENT APPLICATION: US/09/701,626 TIME: 11:15:33

Input Set : A:\Neb-165a.app
Output Set: N:\CRF3\01302001\I701626.raw

Does Not Comply
Corrected Diskette Needed

3 <110> APPLICANT: New England Biolabs, Inc.
4 Vaisvila, Romualdas
5 Morgan, Richard D.
6 Raleigh, Elisabeth
8 <120> TITLE OF INVENTION: Restriction Enzyme Gene Discovery Method
10 <130> FILE REFERENCE: Gene Discovery Method
12 <140> CURRENT APPLICATION NUMBER: US/09/701,626
13 <141> CURRENT FILING DATE: 2000-01-12
15 <150> PRIOR APPLICATION NUMBER: 60/089,086
16 <151> PRIOR FILING DATE: 1998-06-12
18 <150> PRIOR APPLICATION NUMBER: 60/089,101
19 <151> PRIOR FILING DATE: 1998-06-12
21 <160> NUMBER OF SEQ ID NOS: 130
23 <170> SOFTWARE: PatentIn Ver. 2.0
25 <210> SEQ ID NO: 1
26 <211> LENGTH: 14143
27 <212> TYPE: DNA
28 <213> ORGANISM: Unknown
30 <220> FEATURE:
31 <223> OTHER INFORMATION: Genomic DNA of Pseudomonas Alcaligenes NEB #585
32 (ATCC 55044)
34 <400> SEQUENCE: 1

35 atcgatcagc cagacttttc gcacacgggc ggaccttggg cgagtcagcg ctatggttgg 60
36 ccgctgtggg ttgtcagtgc ccgtacgcgc aatctgtttc tttcgcaggg catgtccggc 120
37 tgggcgttcc ggcccgttct ggtcaccgac tcggctctct atgagcgcta tctcgctcta 180
38 agtcaggaac tttgcgcact gcttcgtgat gcaccgcaga gcaagctcga agaccgtgat 240
39 tggtaagcgg gggctattcg atcagtctcg gagcgaccaa actccagaaa cgacaaggcc 300
40 ctgaaaaaaa agcagggctt cgtctttgcg ggcgaatgga atcggacctc tttccgcctc 360
41 tgcatgtaac tggtctttgt ttgccaaatc tgcctatctc atgccggcca tgttggccag 420
42 tgcctgcatc atttggcctt tggtttcgac actttttcga cagccctgct agacatccct 480
43 ccctctgccc tcgtaacttc tgttccgatg gtgtcgcttg gcactatggt cttgtcgagt 540
44 gtcgcttttc atccagccta atgccgcgat tgcctcgctg agctgtagct gaatcaagga 600
45 cttagcggac gacaaggaat gttatgcgaa acatgtggcg gaataaatta cgccgcatgt 660
46 ttcgtctact tatagttagg ctacatatga gaatcagcgc agaccagctt gctcaagaat 720
47 cactgactga gttcggcgtg ctggcggcta agcttctggc aacgcgagag cttagccagt 780
48 tgtccgagaa gtttgggtat gcactggcct tcggaaggga accggcggct gccatagctg 840
49 aggaccttgc taggtgcttg tgcggacaaa atgcttcgcc ggcatctgaa taccccaaaa 900
50 tcaccgttaa gtatttcaag gaaaacgaaa gtagtctgtt ggcactcgta gagtgttatg 960
51 tacaaatgac cgcaagcgca aacattcttt tagagctggt tgccgcacga aatggagagg 1020
52 caataaatct gtatctagaa ggcttgagtg ttgtagccta acaatgcgct caaagcgctc 1080
53 acttcgttcg ctgggaccgg cgaagccggc cccttagctt aatcgttaga aaccatcatg 1140
54 gataactggt acaacaccat cgaataccaa acccatgtag ccgaaaaact agaggcactt 1200
55 ggagaaacaa agtacgaccg cgaggcttat gaattcgcgc tagaggcata ccagtatgcg 1260
56 cctgaatatc atgaaaatat tcccacgccg cctctcaatc ttgggctcgc gtaccatgta 1320
57 agcgccttca actttgcaca ctgctatgta cttcacgcta aagaagtgtt tgaagctcca 1380
58 aaagacacac tgagctcctg gggcgtattt tcctcaacgg acattggtga aattgtttat 1440
59 ggtttagtcc gtattggctt gctggaccaa ggccccgaag acaaaaaaga gcagtttgaa 1500




60 gggttgtttt taatcaccga cgtgctgtga tgtcttctaa ctactggttc aagtcgttcg 1560
61 cttcgctcac tcgggaccgg ctaaagccgg ccccttaacc aaacgttagc cacctcacga 1620
62 agatttggag cccgcgtgaa caaagtcgat acaaacaaaa ttaaaacgga tttttcggca 1680
63 cgaattgatg aaaaaagagc gtggtttgat cgtatggcta cgcttataag cgggacaaac 1740
64 accgagttaa ccgaccttaa ttttctttgc gagaactata taacatcaat atacgtagag 1800
65 ctcgaatgct taatatcaga tttatttcat ggctacataa ataacaacaa caagacctac 1860
66 atggcgcaca ttcaatcaaa aatcaagaac tccataactg acaagtactc tgcatggcac 1920
67 gccacccata caacattcgc aggtccagag catattaatt cagcacagct cagcacgctc 1980
68 cttgatccaa caagctggaa catcacattt aaagacgttt ccgcaatgaa agtacgagca 2040
69 aaggaatacc tttcctcagt acacgaaaaa agattttcag gtatatctgc atccgatgga 2100
70 gctcttattg atgccgcaca tgcaatcaga aattgcattg cacacaacag cgaaagctcc 2160
71 agaaaggtta tgaacaccaa aattaaaagc ttaattacag gcccagcttg ctcaaatgtc 2220
72 ggccttgaac tcaccacaaa tagtgtgacc aaaataggaa agtatctccg tgcaaatgct 2280
73 cagcaaagca tgcgagtgct gatttactca gatcgaataa aatctatcgg cctaagctta 2340
74 taagtgtggg ctaacaatgc gctcaactgt cgctcacttc gttcgctgga cagccaaaag 2400
75 ctacgctttt gtctgcccgt tagcttaatc gttaggaggc tctgcatgac tcgtgcaaca 2460
76 gacaggttcg aagagcttct gcaatcacat gagttctcag ggcatattat tcgttgggtt 2520
77 gcgatattcg aaggccgtct tgacggtgtg ttatcagttc atttttctgg acttgaaagc 2580
78 acctatgaat tctacgaact catactttcc aggttgtctt tctacgaaaa aattgaaatc 2640
79 ctgagaaaaa ttgattttgg taacagtctc aaatcccaag aaaatacagc gctgcaccta 2700
80 gacaaactga ggcgattgcg taacgcattg gcgcatgcag cacacatgcc acctgatgaa 2760
81 atcatgaagt tgtgctctga taagtggata gagtcctttg tgctcggata tccaaagtcc 2820

82 attggcaaag agaaaaatgc acttgaaaat cggctatcac ttctgtggaa ttactgccac 2880
83 aggaggcatg tagcaaaaat taagcatjctt gcacacgaac tcaaaaatac agagcaagcc 2940
84 aactaataga gtccagttat acaggtccgt aaatgagccg cctaacaact ggttcaagcc 3000
85 actcacttcg ctcgctcggg accgcgttcc gcggcccctt aaccaaacgt tgggcaccca 3060
86 tagaaaaatc ctaatigagaa aactattcat arcactaatt ttcgccctgc tatcggagag 3120
87 cttgatggca tctgaagcgt ataaggacct tgaaacacaa gtaactgaaa aagccagcct 3180
88 agcagttgcc caaatgaatg acagagcaac tggaaagctc gactactcgg aagaaagtct 3240
89 ctatgcagta gaagaaatgg cagcggaagc agctcaatac aaagatcaat tagatccagc 3300
90 cactgtagac tcgcttactc aagttcttgg aagctatatt cttyaggttg cacatagaaa 3360
91 gcatggcggc tcttacgttt ggcttgaatc tgaaaactca cctgccttgy tagttggtga 3420
92 accagagtac aggctagcac tctcaacctt cgccaaggta catggccgac tttctggcga 3480
93 cgaagcagat aatcttattt tcttctatca aggcttttct gaaaggctta aatcaccatc 3540
94 tcccggcatg agcgcactct acaaatgaaa cccgag|-.ttg ggggcccaac aatgcgctca 3600
95 actgccgctc acttcgttcg ctggacgtcc aaaagctacg cttttggccg cccgttagct 3660
96 taatcgttat gcccaataaa acatgaagac agcactcata tttgtagctc taatctttct 3720
97 ctctggatgt gacaactatc agtcatgccc tataactggd aaatggaaat ccaacgaaaa 3780
98 gctaacttta gaaagcatga atgaaaccgg caggataacg gcaaagcaaa gagagatttt 3840
99 tgagaacggc ttctttggaa aactagaatt agacataaat tgcagtagct tcacaacaat 3900
100 acttgacggc gttaccgaaa cctttaatta cgagatagtt cgccaaacaa aagattccgt 3960
101 caccgttagc tattacagca aagcgctgca aaaacaagtt gaggtcacat ctattatcaa 4020
102 cggaaattgt tactcgacac ctatagagca gttaaatttc aatgagtatt tctgcagagt 4080
103 cgagtagcgc ataacaattg attcaagtcg ttcgcttcgc tcactgcggg accggctaaa 4140
104 gccggcccct taaccaaacg ttaggcaaag gctcaatgga tcccatattc cataacatcc 4200
105 atagaaacga caaagagatt gagggcgctc atcaacaatg ctcgagcaca atcaatcact 4260
106 tcattgagat ggtcaaaaaa gggggcgagc ccacctatat ggcaaagcta cgttttcttg 4320
107 accctgacaa gtctgaaaaa gaaggtaaga atcatatttt ttatttgtgg ttatctgaag 4380
108 tgctgtacca ccctgcaaca aatttacttt ctggggtatt ttttgaaatc cctgaaggct 4440



109 ttgaaaagtg gcaccaaata ggccagcgcc taggctttga tccagaagat gtctttgatt 4500
110 ggatggtaat cgacaaaggt catgctaagg gtgcatacac actaaaggta tcgcgagagc 4560
111 gcttaaccac cgagcaagaa agaaaagatt ttgaccgcta tattggtgtg gcgtcatatg 4620
112 agtagcctaa aat^.aaycqc tcacgcctca gcctaactac tggttcaagt cactcgcttc 4680
113 gctcgttcgg gaccgcgttc cgcggcccct taaccaaacg ttaggcgcaa yggcaatatt 4740
114 ggtcttcagc accgagtcag gaaacacaat caccgaatca gcgcggtgtt cctgaatcga 4800
115 dtggtcgctg dcagttgagy ccgttatttg tggccagcaa aggagttgct ttcagagaat 4860
116 gtgcacytca caaataactt ccggggccaa aaccgaaacg ccgtgcgctc cgccggttaa 4920
117 gctcggcgct ggcatcattt tcggcgctcg yctggycaat ctaacaattg gctcaagtcg 4980
118 ttcgcttcgc tcactcggga ccggcgaagc cggcccctta gccaaacgtt atgcgagcca 5040
119 ccatgaatag cgaagaatta tacaaaaagy ctatggagtt agagtccaaa tgcgagcata 5100
120 aagcggcaat ttcaacttac aaagaaattg ttaagaaatc taacgatcct cyacacttca 5160
121 tcgcattcgg agtttgcctc caaaaatgtg gtcactggaa gcaatccatc gaggtattag 5220
122 aatcaggaat tgcactgaag cctcactatt gcgagggtga tgctcgtcta tttttagcaa 5280
123 aagcactttt taaatcaggc aaaaaagcjcc ttgcgataaa gcaatggcaa catgtatcaa 5340
124 aaatgcaacc tgagtaccca agttatgagt ctgtgcaaaa tgaagccaag aaaatgcttg 5400
125 cacaaaacgc dtaacaattg gctcaagccg ctcgctccgc tcactcggac gtccgtaagc 5460
126 tacgcttccg gccgcccctt agccaaacgt taggcaccac atgccctcca tcaagtcagc 5520
127 aagcraatac cagcgcgcaa catcgctcat cttcttagtg tcaggcgctg cttggctatt 5580
128 catcgtycag tcttcgttgc tgccattgac ggatgtcgcc cgccaagaaa tygtttgcct 5640
129 taatatcgtt cjttgqtattg cctgttttgt tataggtagt gcggcaaagc gtcagcgaga 5700
130 atttcgctgc cctgactgtg ggaacgaagt agatcagagc ttacctacag agggtgatgg 5760
131 cgccccactc ctaaggctgt gcaagcactg cgatattcta tggaatgttg gcaagacccc 5820
132 agacagttaa agttaccgcc taacaaatgg ttcaagtcgt tcgcttcgct cactcgggac 5880
133 cggctaaagc cggcccctta accaaacgtt aggcaacagg gggtgacatg acgcaatgtc 5940
134 caaggtgcca gcgcaatctc gcagctgacg agttctatgc tggctctagc aaaatgtgca 6000
135 agggttgcat gacttggcaa aacctaagct acaacgcgaa taaggaaggt catgccaaca 6060
136 ccttcaccaa agcgacattt ttggcgtggt acggcttatc agcacagcgg cattgtgggt 6120
137 attycggtat atcggaggca ggttttacat ccttgcacag gactaatcca cgcggctacc 6180
138 acatacagtg tttgggtgtt gatcgctcag attcgttcga aggctattca cctcaaaacg 6240
139 ctcggctcgc ctgttttata tgcaacagga taaaatcaaa catcttcagc gccagtgaga 6300
140 tggacgttct aggtgaggcc atttcaaaag cgtggcatgg tcyaggaatt gcctaactat 6360
141 tcagtcaayc ggacgcaaac cccgctgcgc ggtctttgcg ccycttatct caaycgttag 6420
142 atgaataaaa gcctccacac atagccagct ttacgggaac gaagttgatg cgacgcctct 6480
143 ttactgcctt agtattaatt ttgatgattt ctgggtgctc ctccacatca aaaactgaaa 6540
144 gccataaaca gccgccgaac aattcaagcg acacgaccgc catactgaaa tatatattta 6600
145 ctgttaccca tggcatagaa gagggctttg ccaagtcact tgagcaagga agttacacac 6660
146 cagatgaata tatgctcatg cagcaggctt ttagcaatct tgatctaaac agattaacca 6720
147 ctcttttatic acccacttta gataaaagca tgaacaccgc agacgtaaaa catttcatga 6780
148 tttttataaa atctactgca ggtaagaact tgctaacagc aggagagtct agcacttctt 6840
149 tttccggcac aatggataga gttcgatcac taccctccga acagcagtcg aaaataaacg 6900
150 agttcttcca tgccagctac acaaaaaaca ctttaacagc catgggggct ccagaggcgg 6960
151 tacgcattgt ttatgcattt ggagtggaat ccatgtgcaa ttacgctatg cgcaataatt 7020
152 tcgaactgta tatttctatt atagaaaaag gcaaatgcca ataaaccaca aaaaacaaag 7080
153 gccggatcga tccatgtgaa catcagcgtt acatctaaca tgtggttcaa gccgctcgct 7140
154 tcgctcactc gggaccggct aaagccggcc ccttaaccaa acgttagagg attacatgcc 7200
155 atcactccaa gaactccaat cgccgattga ctcagcaatc gtgaactcca tgattgagag 7260
156 cactcccgaa acatgaagcc agattattct aacgttggta cgcgaatcta attcctcttg 7320
157 tgtaggtaac tttacacatg agttatccag tcctgagggg catgcaccag ttggtccggc 7380



158 agagagctta tttgaagaca cttaccaact cgatgagtta ttctacagcc atggtgagcg 7440
159 cttattcacc aaggcaattt atcgggctaa tgcggttggg gacggttggt catatcacgc 7500
160 tgcjgtttgaa tatgcgtagc atccctctaa caattggttc aagccgctcg cttcgctcgc 7560
161 tcgggatcgg cgaagccggc accttaacca aacgttagag atggtcatga ataaacgcgc 7620
162 actaaccttc gggctactca tagcaattct agctagtatc atttcacaag cgctgcttca 7680
163 tggccaaacg gtaatcgctt ccgacgtggc tttatatact ccatatttcc tatctccaat 7740
164 atttagcatg ctgatacaag ccacatcgat gctggcatgg gccatacctg gactttatgt 7800
165 aggctactta tgcaagaaca agccagcaca acatggagca aaattggggg cagcatatgg 7860
166 aatacttctt ggattaattg tattcgcaat gcgagcttcg acccaattaa cgtagattct 7920
167 aagttaataa tcgcaacatc tgctttaaca caaaaagcaa aatattcagt gcactttgcg 7980
168 ctagttgctc ctgccggcta tcttcttgca aagcatcgtg caaatctcta acaattggtt 8040
169 cagatcgttc gcttcgctca ctgcgggacc ggctgaagcc ggccccttaa ccaaacgtta 8100
170 ggcaactgaa tgatcacctg cattccggca cgtgaattcc tgcgtaaagt atgcggcctg 8160
171 tacgaagcct caactaatgt agttaagttg cgtgtatggg cttgtggata tggcatcgca 8220
172 atggatctaa ctgtcaaagg taaatctgtc ctttgtgcgg ttgcgggagt actccgccaa 8280
173 gaggtcgaat gctttgctca aattgggctt ccgaacgtaa ttcagttagt aggcgacaag 8340
174 gcgtcagaga atcaactaaa gctcataggc atggdaccac caatcgaact tcatatctct 8400
175 cgcgaacaga gcaggctcca agttgtaatc ttgtacgagg gtcaggtaaa ggctacatat 8460
176 gtgctgtcag ccgcctaact actggttcaa gtcgttcgct tcgctcactc gggaccggct 8520
177 aaagccggcc ccttaaccaa acgttaggct ttcaatgaaa acagttccag tgaaaatatc 8580
178 agaagtcgaa ctaatagaga gttttgggaa attcctgatc aatcaagact taatcgacta 8640
179 tgaaaattcc cacttcagtg gcgacgacaa ccataatgca gatgtagcct tatctttaaa 8700
180 gccagggaaa tggccaggca ttcaagtcga taaactacac atagaagtaa agtcacacca 8760
181 ctcagaagac tctcaaaaca ccatcaacaa aatattcggc caattactaa aagaaaccgg 8820
182 aaagcgaagc ctcgataaag agaaagagtg cttagctata ttgttccctt acgagcgcgg 8880
183 cgcatggcca ggtcgaaaca acaaaacagt aacaagaaLI: gaaggtgaag cttattaccg 8940
184 gaggggcttt tcgagaatcg acaaacagac gtttgttaaa tttggtgact tggtcggtgc 9000
185 caaatacatc ctttcctttt ctacagcatc aaacacattg aacgtatttg aatggaaaaa 9060
186 tttcttagat gaggaattca gcccgatgat cagcctaaca aatggttcaa gccgttcgct 9120
187 tcgctcactc gggaccggct aaagccggcc ccttaaccaa acgttagacg caccggaaat 9180
188 tttgcatggg aaaccagaaa atggatttgc agataaacga tacaaaggtt <?^^7tgggtt.t 9240
189 ctccaatact gaagcaatgg atcagcatca acaaagaata cgtcaagcaa tatgatttca 9300
190 aagcctgcct gcactggtat aacgaaaggg caaatataag tgcttttgct ggtgccgttt 9360
191 ggaagtctgg aggttttgcg ctggaagaat attcaactaa aaaaggcacc gaagaaaacg 9420
192 gagccaatgg tcgtgtcgac ctatatttct ccaatgacaa cgagcaagcc attgttgaag 9480
193 caaaaatgga atggctctac ttcggaaagc gcacaagact agatttcaaa gaaaaaatag 9540
194 atcgtgtagt tgaaaaagca aagaatgaca taattaacag cctgcatgcc aacccctacg 9600
195 atctagggct tgggctttcc tttatrtgca catactggaa aaagggttat gacgcatccg 9660
196 ccgacatgca agcccttaga gcgcttatgc aaaattataa ctgcgcattt tatgcaattt 9720
197 ttgaaaacag ccccgacaac gaaattgtta gctcaaaggg caatatctgc aacgctgtga 9780
198 ttttagttgg gacggcgcac agctgaatcg tgtgtgtgcg tctaacaatg cgctcaaagc 9840
199 gctcacttcg ttcgctggga tcggctaaag ccggcccctt agcttaatcg ttagcactag 9900
200 gacttccgac catcatgagt gatagagacg aattttctgc cccaacaaaa agagcgctag 9960
201 ccgaaaggag tggctttagg tgttcttatc ttggttgctc taatgcaacc atagggccta 10020
202 gtgaagaatc agaaacagcc gtagcaagaa cgggggtggc gtgtcatata actgccgcag 10080
203 cgcccggcgg aaaaaggtat gacccaacat taagccctac ggaacgaagc tcaatctcga 10140
204 atggtatatg gatgtgccaa acgcattcag ttgaaataga tagagatgag gcccgataca 10200
205 catcgacctt attaaatcac tggaaaaata tatccgagag ccgagcagat tatgcaaaaa 10260
206 atcatggctg ggatattttt gacaaatacc ccttccttca tattgactcg ctagccaaca 10320



207 taggcctggc tcttaccaaa agcccttcct caaatagcct tatcgggaat gccattacag 10380
208 acagctgcct ccctcaacta tggggtaaag agcaatctgt aatcatcaga gacctaataa 10440
209 tagaacttta tcgaaatgcc ttcgatcacg gcgaggctag ctcattcgaa atatccatat 10500
210 cggagcaaaa actagaaata gtttacgatg gcaaaaaatt tgacatcttc caacttcttg 10560
211 accaccagaa tgcaaacggt ggcgccgata ccttgcaaga aattgtagaa aaatatggca 10620
212 gtaactttgt agtcaactat agccacgaag gcaacaataa aataataatt cacaggctct 10680
213 ctgactttta cgcgcttgca ccatccctcc cgtgcgtaat atcactgagt gaatacgatg 10740
214 acaaggccct agagttagac ctggctattt atgagcgctg cggtgcactg tacataattc 10800
215 taccgttgca tttttgtaga tcagatgtca gggggctaga gtcgcagcta gccgcctttg 10860
216 aacctaatgg aaagccagtt tacattgtag gctcagatgt ggcagagcct acaagaaaag 10920
217 caattataga caggcttccc aacttcacgt tcgtccaaaa gcaatgctaa caatgcgctt 10980
218 aactgtcgct cacttcgttc gctggatagt caaaagctgc gcttttgtct gtccgttagc 11040
219 ttaatcgtta ggcgcaagga gggaccgtga ctgaaactga gaaaatggtg ggtaagttcg 11100
220 tcagcggttt tggcgggcag agataccgag aaatttttga agtcctcgaa tccagtaacc 11160
221 ttcgcccact gggcaagtca aatactgaaa cattgctatt tcagcttcga ggggctgata 11220
222 gtgaaatgct agatattttt gcctttcgct tggggccgcc gccagtaatt tcgtttccca 11280
223 aatcatattg gctaggtcgc cccagtgaat taagcgctca tctatccaat ttttcattct 11340
224 cggaaaagcc agccataaca ggcccggttt ctgactcaca gtattcggca ggccaggtgg 11400
225 aaatcacccg ctctactcat gagaggatta ttgaggtttg caaccgtgtc tgtgcttccc 11460
226 tgcaataagc gcctaacaac tggttcaaat cgctcgctcc gctcgctggg accggcatag 11520
227 ccggcccctt rjaocaaqcqt tagatgcaaa taacttgagg ggcacatgca agactttggg 11580
228 tccagacgaa atgcatcatJ: agaggacagg gctgcggctg agtctgttat tgaacgtgtt 11640
229 tatcttgcga tacagcagct ttgcacagag actggtgacg taagaaatcg gcttcaaata 11700
230 gccgttatga ctctattgcc ccttcaggcg cgtaacttcc ccattgcgtt gcagcaagac 11760
231 ttcgattgga ttgtcagaga atcaaccaaa tacaaatcac catatccgca gtttcggggc 11820
232 gaccttgaag caacgatgat gcgaataagg aactcaactg ggcaaaaaat cgcgcaaaga 11880
233 attttcaata tttactcgtc gctacaagac attcgaggtt ttcccctgct tgaatacagg 11940
234 gcaatagatg agtaagcatc taacaattgg ttaaaaccgt tcgcttcgct cactgggacc 12000
235 ggctaaagcc ggccccttaa ccaaacgtta ggtaaccaag ggaaattcac ttgagttgtt 12060
236 atgtattggg cacaaacaac cgccattaaa ggacggtttt atagtaaatt tcatcggact 12120
237 gttgaactaa aatgcttata cgctttgctc tactacttgc tgttatgctc ctcgctgcat 12180
238 gctcgtcaaa gcaaaatcca acgccgaagt gtactgccag cgtccccccg ccctctttac 12240
239 ccgaaacatc cacagtatgc ctaggggaaa gatgtaattg ggaggtgcta tttccgtcag 12300
240 gaaaataccc tgcatccaca gaaggctgca gggcgcctgt ggtgcagaac cagccttctt 12360
241 cctacccgcg agaagcactt gatcagtgta ttgaggggta cgcttgggta gcggtttttc 12420
242 tgaatgccga cggggtccaa acatcagcaa aggtacttca atcatcgaat aaaattttcg 12480
243 acagaaatgc cttgctacag gccagtaata tattttttga gcctatgaaa tgtcaatccg 12540
244 agcgttatga ttccgttgtt ctgatgccat taaactaccg catactcccc tagtagcggg 12600
245 attgancctt acaaaattca ctacttacgt ccaagttgaa gtaggcagtt taacaactgg 12660
246 ttcaagccgc tcgcttcgct cactcgggac cggctaaatt cggcccctta ggcaaacgtt 12720
247 aactatcaga agggcggttg atgtcaagat ttgcgctcgc gttgattcac ggagtaccaa 12780
248 cgggttttct tgtcatttgt actttgtttg tctgtttcat ctacctcaac cgattcgaga 12840
249 aagttggagg atactcagac gggtggggtt ttgttggaag agttgtctgc gcatctatag 12900
250 ctatggtttt cgtgtccgca gttggccatc ttcttattga agcggcagtc aactgggggc 12960
251 tgcagcagct tggttatgag ctgccaaact atgaaaaaag aaggacttgt agtagctgca 13020
252 agccgagcac tccaggtgac tacatgttcg gcttgctcct cgggggtgtg cttggcgccg 13080
253 gctcggcaat ttggctctgg acgcgcctgg cgctccgata tgcgctgttt cgcggcgaaa 13140
254 actgatagct gaacnttcca tcgaggagat gcaaaagcgc tgctgcgcgc catctacaaa 13200
255 gacccgaagc acctcatcca ggcgctctca gcccgagcct gactggctgt ggctatcaac 13260
<210> 4
<211> 42143 
<212> DNA 
<213> Unknown 



<220> 

<223> Description of Unknown Organism: Genomic DNA of 
Pseudomonas Alcaligenes NEB #585 (ATCC 55044) 

<400> 4 

atcgatcagc cagacttttc gcacacgggc ggaccttggg cgagtcagcg ctatggttgg 60 
ccgctgtggg ttgtcagtgc ccgtacgcgc aatctgtttc tttcgcaggg catgtccggc 120 
tgggcgttcc ggcccgttct ggtcaccgac tcggctctct atgagcgcta tctcgctcta 180 
agtcaggaac tttgcgcact gcttcgtgat gcaccgcaga gcaagctcga agaccgtgat 240 
tggtaagcgg gggctattcg atcagtctcg gagcgaccaa actccagaaa cgacaaggcc 300 
ctgaaaaaaa agcagggctt cgtctttgcg ggcgaatgga atcggacctc tttccgcctc 360 
tgcatgtaac tggtctttgt ttgccaaatc tgcctatctc atgccggcca tgttggccag 420 
tgcctgcatc atttggcctt tggtttcgac actttttcga cagccctgct agacatccct 480 
ccctctgccc tcgtaacttc tgttccgatg gtgtcgcttg gcactatggt cttgtcgagt 540 
gtcgcttttc atccagccta atgccgcgat tgcctcgctg agctgtagct gaatcaagga 600 
cttagcggac gacaaggaat gttatgcgaa acatgtggcg gaataaatta cgccgcatgt 660 
ttcgtctact tatagttagg ctacatatga gaatcagcgc agaccagctt gctcaagaat 720 
cactgactga gttcggcgtg ctggcggcta agcttctggc aacgcgagag cttagccagt 780 
tgtccgagaa gtttgggtat gcactggcct tcggaaggga accggcggct gccatagctg 840 
aggaccttgc taggtgcttg tgcggacaaa atgcttcgcc ggcatctgaa taccccaaaa 900 
tcaccgttaa gtatttcaag gaaaacgaaa gtagtctgtt ggcactcgta gagtgttatg 960 
tacaaatgac cgcaagcgca aacattcttt tagagctggt tgccgcacga aatggagagg 1020
caataaatct gtatctagaa ggcttgagtg ttgtagccta acaatgcgct caaagcgctc 1080 
acttcgttcg ctgggaccgg cgaagccggc cccttagctt aatcgttaga aaccatcatg 1140 
gataactggt acaacaccat cgaataccaa acccatgtag ccgaaaaact agaggcactt 1200 
ggagaaacaa agtacgaccg cgaggcttat gaattcgcgc tagaggcata ccagtatgcg 1260
cctgaatatc atgaaaatat tcccacgccg cctctcaatc ttgggctcgc gtaccatgta 1320 
agcgccttca actttgcaca ctgctatgta cttcacgcta aagaagtgtt tgaagctcca 1380 
aaagacacac tgagctcctg gggcgtattt tcctcaacgg acattggtga aattgtttat 1440 
ggtttagtcc gtattggctt gctggaccaa ggccccgaag acaaaaaaga gcagtttgaa 1500 
gggttgtttt taatcaccga cgtgctgtga tgtcttctaa ctactggttc aagtcgttcg 1560 
cttcgctcac tcgggaccgg ctaaagccgg ccccttaacc aaacgttagc cacctcacga 1620 
agatttggag cccgcgtgaa caaagtcgat acaaacaaaa ttaaaacgga tttttcggca 1680 
cgaattgatg aaaaaagagc gtggtttgat cgtatggcta cgcttataag cgggacaaac 1740 
accgagttaa ccgaccttaa ttttctttgc gagaactata taacatcaat atacgtagag 1800 
ctcgaatgct taatatcaga tttatttcat ggctacataa ataacaacaa caagacctac 1860 
atggcgcaca ttcaatcaaa aatcaagaac tccataactg acaagtactc tgcatggcac 1920 
gccacccata caacattcgc aggtccagag catattaatt cagcacagct cagcacgctc 1980 
cttgatccaa caagctggaa catcacattt aaagacgttt ccgcaatgaa agtacgagca 2040 
aaggaatacc tttcctcagt acacgaaaaa agattttcag gtatatctgc atccgatgga 2100 
gctcttattg atgccgcaca tgcaatcaga aattgcattg cacacaacag cgaaagctcc 2160 
agaaaggtta tgaacaccaa aattaaaagc ttaattacag gcccagcttg ctcaaatgtc 2220 
ggccttgaac tcaccacaaa tagtgtgacc aaaataggaa agtatctccg tgcaaatgct 2280 
cagcaaagca tgcgagtgct gatttactca gatcgaataa aatctatcgg cctaagctta 2340 
taagtgtggg ctaacaatgc gctcaactgt cgctcacttc gttcgctgga cagccaaaag 2400 
ctacgctttt gtctgcccgt tagcttaatc gttaggaggc tctgcatgac tcgtgcaaca 2460 
gacaggttcg aagagcttct gcaatcacat gagttctcag ggcatattat tcgttgggtt 2520 
gcgatattcg aaggccgtct tgacggtgtg ttatcagttc atttttctgg acttgaaagc 2580 
acctatgaat tctacgaact catactttcc aggttgtctt tctacgaaaa aattgaaatc 2640 
ctgagaaaaa ttgattttgg taacagtctc aaatcccaag aaaatacagc gctgcaccta 2700 
gacaaactga ggcgattgcg taacgcattg gcgcatgcag cacacatgcc acctgatgaa 2760 
atcatgaagt tgtgctctga taagtggata gagtcctttg tgctcggata tccaaagtcc 2820 






gctcggcaat ttggctctgg acgcgcctgg cgctgcgata tgcgctgttt cgcggcgaaa 13140
actgatagct gaaccttcca tcgaggagat gcaaaagcgc tgctgcgcgc catctacaaa 13200 
gacccgaagc acctcatcca ggcgctctca gcccgagcct gactggctgt ggctatcaac 13260 
acctcttcga taccactacc cgccagaaac gacaaagccc tgcaaaaagc agggctttgt 13320 
ctttggggat ctggagcggg cgaagggaat cgaaccctcg tcatgagctt gggaagctca 13380 
ggtaatgcca ttatacgacg cccgctcggg cggctgactt tttaccagaa tcgcccggga 13440 
aggtgaagcc gggcgcgcgt cttgcgcccg ttttattgcc gggcgcttca tagcgccacg 13500 
gcccgtggct ctcgttccac gctgcgtgcg tggccctgcg tgggtgccag caggaaggcc 13560 
agcagggcat cgcgggtctg catccaggcg gccttgtgtt ccatgtcgag gaagtggccg 13620 
gcctgggcga tggtgcggaa ctcgcagtgg cgcacgtact gggtgaacag gcgcgcgtca 13680 
gccggggtgg tgtactcgtc ccactcgccg ttgacgaaca gcagcggtat ctcgatctgc 13740 
ccggcgaagc tgacgcagga gcgcccgccg ttgttcagca cggtttccac gtggtgactc 13800 
atttgctcat attcatagcg ctccaggccg gtgacgtgtc gatggttgta gcgcttgaac 13860 
agcgagggca ggtgcttgcc gatggtgccg ttgagcacca tgccgatgct ctcgcggtcg 13920 
cactcgcgca tcaccaccag gccggcgcgc aggtagccga gcatggcgct gttgacgatc 13980 
ggcgagaagg agttgatcac cgcacgctcg atccgcgatg gacgccgggc cagcgcctgg 14040 
agggtggcga tgccgcccca ggagaaagga cagcacgctg ttcgcaggcg aaaatgttcg 14100
accagctcca ggaagatgtc ggcttctttc ctcgcsgctg aaggrbsc^n)^ agcttctggt 14160
acgaacctgg gggcgctccg gcacgcacaa gggcatcgac atcttcgccc gccagggcac 14220
cccggtgctc gcccccagct acggcatcgt ggtgtttcgc gacgagctcg acatgggcgg 14280
caaggtactg ctgatgctcg gccccaaatg gcgcctgcac tacttcgccc acctcgacag 14340
ctacagcgcc ctgcccggcc aacccgtact tcccggcgcc ccactcggca cggtaggcag 14400
caccggcaac gcccagggca agccgcccca tctgcactac tcgatcgtca ccctgttgcc 14460 
ctatccctgg cgctgggaca acagcactca gggctggaag aaaatgttct acctcgaccc 14520 
cacgccaatg ctgaacgaag cggcagtaga cagccgaaaa accagccagt agcgtcgcag 14580 
gggaatgcac caccggtctt gcccgatccg cctgtccttt taccaatcgc agaagagtcg 14640 
cttttgtcga atcgcctgtg aggaaaaaca aggacttgct ggacgacaag gaacgttatg 14700 
cgacacaagt ggcggaataa attacgccat ttgtgtcgtc tacttatagt tatatgctga 14760 
tctagatatg aagtacaaaa acataaaatc agcaatccac aatttcgggc acagctttgt 14820 
aagctcagtg aactatgttg accatgattt cgttgccgac gaaattggga agattcacaa 14880 
gaaaggctat gatattgaaa taaactggct tacaagggag ttcaagcccg ctcagcttga 14940 
gtcagagaga ataaaaaaat caattggtta ttggggtgac aacctaaaga aacattgtgc 15000 
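
Each sequence row above ends with a running count of the residues printed so far (60, 120, 180, ...), so a dropped or duplicated base shows up as a counter that no longer matches. The sketch below illustrates that consistency check. It assumes rows are given without leading file line numbers, that counting starts at the first row passed in, and that groups are separated by spaces; the function name `check_position_counters` is hypothetical.

```python
def check_position_counters(rows):
    """Return (row_index, expected, found) for rows whose trailing counter is off."""
    mismatches = []
    total = 0
    for index, row in enumerate(rows):
        # The last whitespace-separated field is the counter; the rest are residues.
        *groups, counter = row.split()
        total += sum(len(group) for group in groups)
        found = int(counter)
        if found != total:
            mismatches.append((index, total, found))
            # Resynchronise so that one bad row is reported only once.
            total = found
    return mismatches
```

Run over the first two rows of the listing above, the function returns an empty list, since both carry six groups of ten residues and the counters 60 and 120.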






Please Note: 

Use of n and/or Xaa have been detected in the Sequence Listing. Please review the 
Sequence Listing to ensure that a corresponding explanation is presented in the <220> to 
<223> fields of each sequence which presents at least one n or Xaa. 
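
The review requested in the note above can be approximated in code: find every sequence whose residues contain n (or Xaa for proteins) and confirm that the record carries <220> and <223> fields. This is a rough sketch under assumptions, not the PTO's verification logic: it assumes new-rules numeric identifiers without file line numbers, treats any record whose <212> is PRT as protein, and does not check that the <223> text actually explains the n or Xaa. The function name `sequences_missing_feature` is hypothetical.

```python
import re


def sequences_missing_feature(listing_text):
    """Return SEQ ID numbers that use n/Xaa but have no <220> and <223> fields."""
    flagged = []
    # Split just before each <210> so every chunk holds one sequence record.
    for record in re.split(r"(?=<210>)", listing_text):
        header = re.match(r"<210>\s*(\d+)", record)
        if header is None:
            continue
        # Residues are everything after the line that carries <400>.
        _, _, after_400 = record.partition("<400>")
        residues = after_400.split("\n", 1)[1] if "\n" in after_400 else ""
        seq_type = re.search(r"<212>\s*(\w+)", record)
        is_protein = seq_type is not None and seq_type.group(1) == "PRT"
        # Proteins use three-letter codes, so a bare "n" would match Asn or Gln.
        pattern = r"\bXaa\b" if is_protein else r"n"
        uses_unknown = re.search(pattern, residues) is not None
        has_feature = "<220>" in record and "<223>" in record
        if uses_unknown and not has_feature:
            flagged.append(int(header.group(1)))
    return flagged
```

A sequence returned by this function would still need a human to write the explanation; the check only locates where one is absent.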





VERIFICATION SUMMARY 

PATENT APPLICATION: US/09/701,626

DATE: 01/30/2001
TIME: 11:15:34

Input Set : A:\Neb-165a.app
Output Set: N:\CRF3\01302001\I701626.raw

L:12 M:270 C: Current Application Number differs, Replaced Current Application Number
L:13 M:271 C: Current Filing Date differs, Replaced Current Filing Date
L:897 M:258 W: Mandatory Feature missing, <221> not found for SEQ ID#:4
L:897 M:258 W: Mandatory Feature missing, <222> not found for SEQ ID#:4
L:897 M:340 W: (46) "n" or "Xaa" used: Feature required, for SEQ ID#:4
L:1019 M:258 W: Mandatory Feature missing, <221> not found for SEQ ID#:4
L:1019 M:258 W: Mandatory Feature missing, <222> not found for SEQ ID#:4
M:340 Repeated in SeqNo=4
L:1141 M:258 W: Mandatory Feature missing, <221> not found for SEQ ID#:4
L:1141 M:258 W: Mandatory Feature missing, <222> not found for SEQ ID#:4
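
The summary lines above follow a regular shape, which makes them easy to tabulate when many sequences are flagged. The sketch below parses them into records. The field meanings are my reading rather than anything the document states: `L:` is taken to be a line number in the input file, `M:` a message number, and the single letter before the colon (`C` or `W` here) a severity or action flag. The names `Message` and `parse_summary` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Message(NamedTuple):
    line: Optional[int]      # value after "L:", if present
    code: int                # value after "M:"
    flag: Optional[str]      # single letter such as "C" or "W", if present
    text: str                # remainder of the message


# "L:<n>" and the one-letter flag are optional, as in "M:340 Repeated in SeqNo=4".
PATTERN = re.compile(r"^(?:L:(\d+)\s+)?M:(\d+)\s+(?:([A-Z]):\s*)?(.*)$")


def parse_summary(lines):
    """Parse verification summary lines, skipping anything that does not match."""
    messages = []
    for raw in lines:
        match = PATTERN.match(raw.strip())
        if match is None:
            continue
        line, code, flag, text = match.groups()
        messages.append(Message(int(line) if line else None, int(code), flag, text))
    return messages
```

Grouping the parsed records by `code` gives a quick count of how often each message occurs, for example how many times message 258 is raised for SEQ ID 4.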